Conformational gel analysis and graphics: Measurement of side chain rotational 
isomer populations by NMR and molecular mechanics

Conformational gel analysis and graphics systematically identifies and evaluates plausible alternatives
to the side chain conformations found by conventional peptide or protein structure determination
methods. The proposed analysis determines the populations of side chain rotational isomers
and the probability distribution of these populations. The following steps are repeated for each side 
chain of a peptide or protein: first, extract the local molecular mechanics of side chain rotational 
isomerization from a single representative global conformation; second, expand the predominant set 
of rotational isomers to include all probable rotational isomers down to those that constitute just 
a small percentage of the population; and third, evaluate the constraints vicinal coupling constants 
and NOESY cross relaxation rates place on rotational isomer populations. In this article we apply 
conformational gel analysis to the cobalt glycyl-leucine dipeptide and detail the steps necessary to 
generalize the analysis to other amino acid side chains in other peptides and proteins. For a side 
chain buried within a protein interior, it is noteworthy that the set of probable rotational isomers 
may contain one or more rotational isomers that are not identified by conventional NMR struc- 
ture determination methods. In cases such as this the conformational gel graphics fully accounts 
for the interplay of molecular mechanics and NMR data constraints on the population estimates. 
The analysis is particularly suited to identifying side chain rotational isomers that constitute a 
small percentage of the population, but nevertheless might be structurally and functionally very 
significant. 



I. INTRODUCTION 

Simple changes in functional groups can often make
multiple order-of-magnitude changes in biological activity [1].
This suggests that protein conformations populated
at the 10%, 1.0%, or even 0.1% level may play a
very significant role in function.

The most common NMR protein structure determination
methods generate perhaps a few dozen complete
protein structures [2]. The structures in this ensemble
are independently fit to the data and each final structure
should fit the data about equally well. The optimization
of a multiconformer model differs sharply from
these standard fitting methods because the measure of 
goodness-of-fit is not defined for any single structure in 
the ensemble, but depends simultaneously on the entire 
ensemble of structures. In contrast to the standard methods,
no single structure in the final multiconformer ensemble
need be a particularly good fit to the data.
In favorable cases multiconformer models can give some
indication of the conformational variability of proteins in
solution. For example, particular secondary structure
elements, loops, or even side chains of the multiconformer 
model might have a larger RMS deviation indicating vari- 
ability of these parts in solution. Information about local 
variability thus comes from a global fitting procedure. 
Whether conformational variability is assessed by con- 
ventional methods or multiconformer models, the essen- 
tial idea is to narrow down the vast global conformational 
space of a protein by applying the constraints of real data.
This point also marks one of the key differences of conformational
gel analysis because conformational gel analysis
identifies local conformations based solely on molecular 
mechanics and a single representative global conforma- 
tion previously determined from current or preexisting 
NMR data or even crystallographic data. Instead of ap- 
plying NMR data to pick out new global conformations, 
the NMR data is analyzed to determine the extent to 
which it constrains the populations of the local confor- 
mations identified by molecular mechanics. 

Detailed information about local conformations is of- 
ten available from molecular mechanics. In the case of 
protein side chain rotational isomerization, good esti- 
mates of the position and shape of potential energy wells 
are known and approximate depths of these wells are also 
known. Even in cases where such information might
be exploited to reduce or eliminate the conformational 
search problem, its potential use is overlooked or even 
dismissed, apparently because the experimental data is
judged more reliable than the molecular mechanics results,
because the molecular mechanics geometries and energies
are judged to be globally coupled to such an extent
that local conformational information cannot be separated
out, or perhaps because the molecular mechanics models
are not readily available [13, 14].

The cobalt glycyl-leucine dipeptide NMR data analyzed
in this article was previously analyzed in a more
comprehensive and perhaps less accessible manner [13].
The present analysis of this data is not designed to ex- 
tend the knowledge of the cobalt dipeptide system, but 
rather to be readily generalizable to proteins. The cobalt 
dipeptide analysis is particularly suited to generalization 
to proteins because the cobalt chelate ring system fixes
the dipeptide backbone in a definite conformation and
the simplifying assumption of a single backbone conformation
also applies to the analysis of proteins.

FIG. 1: Stereo views of the predominant rotational isomers
of the leucine side chain of the cobalt glycyl-leucine dipeptide:
top, trans gauche+; middle, gauche− trans; and bottom,
gauche+ gauche+. The atom gray scale tones are: white, hydrogen;
light gray, carbon; medium gray, nitrogen; dark gray,
oxygen; black, cobalt. The leucine side chain projects outward
towards the viewer and the three chelated nitro groups
are visible below, behind, and above the cobalt dipeptide ring
system.

II. CONFORMATIONAL GEL ANALYSIS AND 
GRAPHICS 

Conformational gel analysis and graphics provides de- 
tailed molecular mechanics population estimates and as- 
sesses the constraints that NMR data places on these 
population estimates. Much of this information is ex- 
pressed in the form of gel graphics. Though gels are 
widely applied to the separation, characterization, and 
identification of all types of biomolecules and their de- 
veloped images are widely seen in the biochemical liter- 
ature, the gel graphics employed here are entirely com- 
puter generated and their interpretation is in many ways 
novel. The basic facts about their interpretation are per- 
haps most easily explained by showing their connection 
to molecular mechanics energy plots. We will make this
connection in the opening paragraphs of this section in
a highly simplified analysis of the three predominant
rotational isomers of the leucine side chain of the cobalt
dipeptide (Figure 1) and then move on to a detailed
explanation of conformational gel analysis and graphics in
the three numbered subsections of this section.

FIG. 2: Molecular mechanics energy schematic for the three
predominant leucine side chain rotational isomers. Isomerization
energies are plotted as a function of the χ1 torsion angle
for rotation about the leucine side chain α to β-carbon bond.
The horizontal axis labels refer to only the χ1 torsion angle
of each rotational isomer. Solid, molecular mechanics energy
function; dashed, NMR experimental data energy function;
dotted, energy function distribution from estimated molecular
mechanics errors. This energy function distribution displays
essentially the same information as a gel graphic.

A gel graphic can convey the uncertainty of rotational
isomer populations obtained either by molecular mechanics
calculations or by fitting NMR data. The distinction
between calculated or fit populations and the uncertainties
of these populations parallels that between a simple
energy function and an energy function distribution
(Figure 2). In this example all the energy functions give
energies for rotation about the leucine side chain α to
β-carbon bond. The three troughs of each sinusoidal function
are the energy wells of the three predominant rotational
isomers and the three crests are the energy barriers
to interconversion. Note that the molecular mechanics
energy function (Figure 2, solid) can be calculated from
the full χ1 χ2 energy map (Figure 3) by rotating the χ1
torsion angle and as appropriate adjusting the χ2 torsion
angle in such a way as to pass over the energy barriers
separating the three predominant rotational isomers; see
Methods.

The molecular mechanics energy function is calculated
from an empirical energy function, which has many parameters,
such as bond length and bond angle equilibrium
values, torsion angle phases and multiplicities, force
constants, atomic partial charges, and Lennard-Jones
constants [14]. These parameters are fit to theoretical
and experimental data for model compounds. Errors are
introduced by this data, by the necessary simplicity of
an empirical energy function, and by the need to transfer
parameters from the simple model compounds to a
larger molecule of interest, such as the cobalt dipeptide.
The χ1 χ2 energy map on which this example is based
is calculated by energy minimization with the χ1 and χ2
torsion angles constrained and without any explicit solvent.
This makes for quick calculation, but introduces
further errors because there is no averaging over other
cobalt dipeptide internal degrees of freedom nor over any
solvent degrees of freedom. The molecular mechanics energy
function (Figure 2, solid) is a best overall estimate
of the rotational isomerization energy and each energy
function in the distribution of energy functions (Figure 2,
dotted) represents a possible deviation from this best
estimate due to all the above errors.

FIG. 3: Molecular mechanics energy map for rotational isomerization
of cobalt dipeptide leucine side chain. Contour
levels are dashed, 1, 3, 5, 7, 9; solid, 2, 4, 6, 8, 10; dotted, 15,
20 kcal/mol. Zero corresponds to −39.4 kcal/mol. The nine
rotational isomers are labeled at the position of their energy
well minima.

The Boltzmann factor establishes a relation between
the energy and the population of a state. The connection
between rotational isomer population uncertainties
and an energy function distribution follows from this relation.
For example, the energies at the trans, gauche−,
and gauche+ well minima of the molecular mechanics energy
function are 0.5, 0.4, and 2.4 kcal/mol (Figure 2,
solid) and at 300 K the Boltzmann factors for
these energies give populations of 45, 53, and 2% for the
three predominant rotational isomers. It is also important
to note that though it is meaningful to talk about the
energy of rotational isomerization of a single molecule,
the population must refer to an ensemble average over
many molecules or at least to a time average for a single
molecule. Even though there is an ensemble of molecules
there is no uncertainty in the populations predicted by
the simple energy function. The population probability
distributions (Figure 4, lanes 5-7) only arise from the
distribution of energy functions (Figure 2, dotted). Note
that probability distributions for the trans, gauche−, and
gauche+ predominant rotational isomers in this example
are only slightly affected by the other rotational isomers
in the probable set (Figure 4, lanes 1-4, 8, and 9) because
the probability distributions of all these other rotational
isomers favor very low populations. Each lane
in the molecular mechanics gel graphic is itself a probability
distribution for one rotational isomer and the total
probability for each rotational isomer is always normalized
to one. If the uncertainties in the well depths are all
zero, then the population of each rotational isomer is precisely
determined. In contrast, nonzero uncertainties in
the well depths give population probability distributions,
which appear on the gel graphic as either broadened or
extended bands. The molecular mechanics gel graphic
gives explicit visual representation to the errors inherent
in molecular mechanics.

FIG. 4: Molecular mechanics gel graphic for the cobalt dipeptide
leucine side chain. The gel graphic is calculated from the
molecular mechanics energy map in the absence of the experimental
data and identifies the probable set of rotational
isomers for further conformational gel analysis. The population
probability distributions shown in the gel graphic are
calculated for an uncertainty of the map energy well depths
of ±1.0 kcal/mol and a temperature of 300 K. Each gray scale
step of the stepwedge bar corresponds to a two-fold change in
probability density.

FIG. 5: Conformational gel graphic for the entire probable
set of rotational isomers of the cobalt dipeptide leucine side
chain. The gel graphic visually portrays the extent to which
the NMR data constrains the populations of the probable set
of isomers. The population probability distributions shown in
the gel graphic are constructed by repeated fitting of the rotational
isomer populations to Monte Carlo NMR data. Each
gray scale step of the stepwedge bar corresponds to a two-fold
change in probability density.

FIG. 6: Conformational gel graphic for the predominant
set of rotational isomers of the cobalt dipeptide leucine side
chain. The restriction of the probable set of rotational isomers
to those that molecular mechanics suggests are predominant
clearly reduces the uncertainty in the populations. For
a protein side chain such a restriction is not always desirable
because less conspicuous rotational isomers might be structurally
and functionally very significant.
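The Boltzmann conversion and the origin of the population probability distributions can be illustrated with a short Python sketch. It reproduces the 45, 53, and 2% populations from the quoted well energies and then perturbs the well depths to obtain a distribution of populations for each isomer. Treating the ±1.0 kcal/mol uncertainty as an independent Gaussian error on each well, as well as the function names, the sample count, and the percentile summary, are assumptions of this sketch and are not taken from the Methods.

```python
import math
import random

R_KCAL = 0.0019872  # gas constant, kcal/(mol K)

def boltzmann_populations(energies, temperature=300.0):
    """Convert well energies (kcal/mol) into fractional populations."""
    kt = R_KCAL * temperature
    lowest = min(energies)  # shifting by the minimum avoids overflow
    weights = [math.exp(-(e - lowest) / kt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def sample_populations(energies, sigma=1.0, n_samples=5000, seed=0):
    """Perturb every well depth with an independent Gaussian error
    (an assumption of this sketch) and recompute the populations."""
    rng = random.Random(seed)
    return [boltzmann_populations([e + rng.gauss(0.0, sigma) for e in energies])
            for _ in range(n_samples)]

# Well minima quoted in the text for trans, gauche-, and gauche+ (kcal/mol).
wells = [0.5, 0.4, 2.4]
print([round(100 * p) for p in boltzmann_populations(wells)])  # [45, 53, 2]

# Each column of samples plays the role of one lane of a gel graphic.
samples = sample_populations(wells)
for name, column in zip(("trans", "gauche-", "gauche+"), zip(*samples)):
    ordered = sorted(column)
    cut = len(ordered) // 20
    print(f"{name}: 5th to 95th percentile "
          f"{ordered[cut]:.2f} to {ordered[-cut - 1]:.2f}")
```

With zero uncertainty every sample equals the best-estimate populations, which corresponds to the precisely determined case described above.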

The interpretation of conformational gel graphics (Figures
5 and 6) is in many respects similar to that of
a molecular mechanics gel graphic as described above.
Again we will focus on the population estimates for the
three predominant rotational isomers (Figure 5, lanes 5-7)
and leave detailed consideration of the entire probable
set of rotational isomers to a later subsection. If a model
with only the trans, gauche−, and gauche+ predominant
rotational isomers is fit to the NMR data for the cobalt
dipeptide, the predicted populations of these rotational
isomers are 21, 54, and 25%. Though it is natural to fit
the NMR data by adjusting rotational isomer populations
rather than well energies, these fit populations may be
converted to energies by inverting the Boltzmann factor
procedure described in the previous paragraph. At 300 K
the fit populations are equivalent to a simple
energy function with trans, gauche−, and gauche+ energy
well minima at 0.9, 0.4, and 0.8 kcal/mol (Figure 2,
dashed). The standard Monte Carlo method for estimating
the distribution of the rotational isomer population
estimates assumes that the estimated populations are the
true populations, generates a large number of simulated
NMR data sets, and fits these simulated NMR data sets
to generate the population distribution; see Methods.
The gel graphic (lanes 5-7) displays this population
distribution. By the inverted Boltzmann factor
procedure the population distribution could be converted
into an energy function distribution and plotted in an
energy schematic. In the same way that the molecular
mechanics gel graphic (Figure 4, lanes 5-7) has an equivalent
energy function distribution (Figure 2, dotted) so
also does a conformational gel graphic (Figure 5, lanes 5-7)
have an equivalent energy function distribution (not
shown). Only the origins of the distributions differ. The
molecular mechanics energy function distribution is generated
directly by applying Monte Carlo energy errors to
a simple molecular mechanics energy function, but the
energy function distribution equivalent to a conformational
gel is generated from the underlying rotational isomer
population distribution, which is in turn indirectly
generated from Monte Carlo NMR data errors as outlined
above. In short, the conformational gel graphic gives explicit
visual representation to the errors inherent in the
NMR data.
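The inversion of the Boltzmann factor mentioned above can be written in a few lines of Python. The sketch takes the energy zero such that E = -kT ln(p) for populations normalized to one, which reproduces the 0.9, 0.4, and 0.8 kcal/mol values quoted for the fit populations; the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant, kcal/(mol K)

def populations_to_energies(populations, temperature=300.0):
    """Invert the Boltzmann factor: E_i = -kT ln(p_i), in kcal/mol.
    Assumes the populations are normalized to one."""
    kt = R_KCAL * temperature
    return [-kt * math.log(p) for p in populations]

# Fit populations quoted in the text for trans, gauche-, and gauche+.
fit_populations = [0.21, 0.54, 0.25]
print([round(e, 1) for e in populations_to_energies(fit_populations)])
# [0.9, 0.4, 0.8]
```

Applying the same function to every Monte Carlo set of populations would give the equivalent energy function distribution that the text notes is not shown.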

We now turn to a detailed explanation of the confor- 
mational gel analysis and graphics of the cobalt dipeptide 
and along the way indicate how this analysis can be generalized
to proteins.

1. Extract local molecular mechanics from a single
representative global conformation 

In this work we confine our attention to the rotational
isomerization of peptide and protein side chains as examples
of local molecular mechanics. The cobalt dipeptide
energy map for the rotational isomerization of the
leucine side chain (Figure 3) is a very simple example
of local molecular mechanics. To calculate this map the
leucine side chain torsion angles are constrained to values
on a 5 degree grid over the χ1 χ2 torsion space
and the entire dipeptide structure is energy minimized.
To extract local molecular mechanics of proteins, local
backbone flexibility and local side chain interactions
must be carefully controlled. A single representative
global conformation can be routinely extracted from an
ensemble of NMR protein structures by averaging and
constrained minimization.

Though only the side chain conformation at an energy 
minimum of a well is required for the gel analysis, the en- 
tire map is useful for automated identification of the en- 
ergy minima and is absolutely essential for correctly con- 
trolling the local backbone flexibility and neighbor side 
chain interactions. To control the backbone flexibility the 
entire protein backbone is fixed except for backbone seg- 
ments of two or at most three amino acids. Essentially, 
the number and length of these free atom segments must 
be sufficient to accurately determine the position of the 
energy well minima, but this number and these lengths 
must not be so unduly generous as to make energy min- 
imizations unnecessarily expensive or as to obscure the 
energy map with transitions of nonlocal backbone degrees 
of freedom. The energy map is a very effective tool for
eliminating these nonlocal transitions because they show 
up as discontinuities of the energy surface. If a map has 
these discontinuities then it must be recomputed with re- 
duced backbone flexibility. The accuracy of the energy 
map can be judged by comparing the position, shape and 
depth of energy wells of maps computed at two or three 
different levels of backbone flexibility. 
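Both uses of the energy map described above, automated identification of the energy minima and detection of discontinuities, can be sketched on a periodic 5 degree grid. In the Python sketch below the energy function is a hypothetical stand-in built from two threefold torsion terms, not a minimized molecular mechanics energy, and the 2.0 kcal/mol jump threshold is likewise an assumed value chosen only for illustration.

```python
import math

STEP = 5          # grid spacing in degrees, as in the text
N = 360 // STEP   # grid points per torsion angle

def synthetic_energy(chi1, chi2):
    """Hypothetical stand-in for a minimized energy (kcal/mol)."""
    return ((1 + math.cos(math.radians(3 * chi1)))
            + (1 + math.cos(math.radians(3 * chi2))))

def build_map(energy):
    return [[energy(i * STEP, j * STEP) for j in range(N)] for i in range(N)]

def find_minima(grid):
    """Grid points strictly lower than all eight periodic neighbors."""
    minima = []
    for i in range(N):
        for j in range(N):
            neighbors = [grid[(i + di) % N][(j + dj) % N]
                         for di in (-1, 0, 1) for dj in (-1, 0, 1)
                         if (di, dj) != (0, 0)]
            if all(grid[i][j] < e for e in neighbors):
                minima.append((i * STEP, j * STEP))
    return minima

def find_jumps(grid, threshold=2.0):
    """Adjacent grid points whose energies differ by more than threshold."""
    jumps = []
    for i in range(N):
        for j in range(N):
            for di, dj in ((1, 0), (0, 1)):
                k, m = (i + di) % N, (j + dj) % N
                if abs(grid[i][j] - grid[k][m]) > threshold:
                    jumps.append(((i * STEP, j * STEP), (k * STEP, m * STEP)))
    return jumps

grid = build_map(synthetic_energy)
print(len(find_minima(grid)))  # 9
print(len(find_jumps(grid)))   # 0
```

A map that returned a nonempty list of jumps would, following the text, be recomputed with reduced backbone flexibility.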

The effects of neighbor side chain interactions are 
assessed by truncating neighbor side chains at the β-carbon.
By comparing the shape and position of energy
wells of maps calculated with and without neighbor
side chain truncation it is possible to gauge the extent 
to which particular interactions influence particular side 
chain rotational isomers. In rare cases a neighbor side 
chain interaction may be judged to preclude a particular 
energy well at any potentially interesting level of pop- 
ulation. More commonly a neighbor side chain interac- 
tion will simply increase the uncertainty of a target well's 
energy depth because the uncertainty of the interaction 
strengths of the target well with all of the neighbor side 
chain wells and the uncertainty in the energy depth of all 
the neighbor wells must be folded into the uncertainty of 
the target well's depth. 

The energy map for the leucine side chain of the cobalt
dipeptide (Figure 3) is similar to maps computed for side
chains in a variety of protein environments. One important
similarity is that during calculation of the side
chain energy map the cobalt dipeptide backbone is fixed
in a definite conformation by the cobalt chelate ring system.
This parallels the approach to calculating a protein
side chain energy map described above, where the protein 
backbone is fixed in the same conformation as found in 
the single representative global conformation. The cobalt 
dipeptide and protein maps both give the local molecu- 
lar mechanics of side chain rotational isomerization. The 
leucine map of the cobalt dipeptide is also similar to pro- 
tein side chain maps in that the total number of wells 
present in the map is the same or close to the number of 
ideally possible rotational isomers, for example, nine in 
the case of a leucine side chain. This is true even for a 
side chain buried within a protein because neighbor side 
chains are usually truncated at the β-carbon in order not
to exclude target side chain conformations that interact 
unfavorably with the neighbor side chain conformation 
that happens to exist in the single representative global 
conformation of the protein. As already mentioned the 
uncertainty of the energy depth of such target side chain 
wells is substantially increased. The positions of the well 
energy minima on the energy map for the leucine side 
chain of the cobalt dipeptide differ from their ideal positions
by anywhere from about 10 degrees (trans gauche+ rotational
isomer) to almost 60 degrees (trans trans rotational isomer).
The average departure from ideal positions tends
to increase for side chains buried within proteins. 



2. Expand the predominant set to include all probable 
rotational isomers 

The single representative global conformation deter- 
mined by conventional methods already gives a preferred 
conformation for each side chain of a protein. For each 
side chain it is usually possible to adjoin the preferred 
rotational isomer with one or two others and form a 
predominant set of rotational isomers that account for 
say 90% of the population of the side chain's rotational
isomers. The side chain rotational isomers included in
this predominant set might be present in an ensemble of
conformations determined by conventional NMR, would
probably have low energy wells in the molecular mechanics
energy map, and might possibly be suggested by rotamer
preference libraries compiled from the Protein Data
Bank. There are several different scenarios that may arise
and these are best illustrated by referring to the cobalt
dipeptide molecular mechanics gel graphic (Figure 4).

The molecular mechanics gel graphic plays an impor- 
tant role both in identifying the predominant set of rota- 
tional isomers and in expanding this set to make the set of 
all probable rotational isomers. As discussed in the open- 
ing paragraphs of this section this gel graphic differs from 
the energy map in that it takes into account not only the 
energy depth of each rotational isomer's well, but also the 
uncertainty of the energy well depths. A comparison of the
cobalt dipeptide gel graphic and gel graphics computed
for side chains in a variety of protein environments follows
along much the same lines as the comparisons between
energy maps in the previous section. Like a protein energy
map, the resulting protein molecular mechanics gel
reflects the local molecular mechanics of the target side 
chain. Just as the protein energy map usually has en- 
ergy wells corresponding to each of the ideally possible 
rotational isomers, the resulting protein molecular me- 
chanics gel has the same corresponding lanes. Unlike the 
well depth uncertainties of the cobalt dipeptide molecu- 
lar mechanics gel, which are all equal because there are 
no neighbor side chains, the uncertainties of a protein gel 
would be significantly larger for rotational isomers that 
interact strongly with neighbor side chains. It is difficult 
to draw reliable conclusions from the molecular mechan- 
ics energy map without taking these uncertainties into 
account. 

From the cobalt dipeptide molecular mechanics gel
graphic (Figure 4) the predominant rotational isomers
of the leucine side chain are trans gauche+ and gauche−
trans and possibly gauche+ gauche+. Analysis of NMR
data suggests that these first two rotational isomers are
the most populated and that the third may also make
a significant contribution. This agreement between
experiment and molecular mechanics comes about because
the experimental analysis is designed to reproduce
the rotational isomer preferences observed in the Protein
Data Bank, which not coincidentally are much the
same as the relative rotational isomer stabilities predicted
by molecular mechanics. We expect that for many protein
side chains the same set of predominant rotational
isomers will be given by the molecular mechanics gel 
graphic and the ensemble of conformations determined 
by conventional NMR, but for slightly different reasons 
than those just mentioned for the cobalt dipeptide leucine 
side chain. Perhaps the most important reason that the 
molecular mechanics energy map is likely to predict the 
same predominant rotational isomers is that it is com- 
puted with the backbone fixed in the same conformation 
as found in the single representative global conformation. 
The side chain energy map is thus likely to favor the rota- 
tional isomers populated in the ensemble of conventional 
NMR conformations because the map is computed with 
the backbone fixed in a conformation that is representa- 
tive of this very same ensemble. 

The set of all probable rotational isomers includes all
the predominant rotational isomers as well as those that
make smaller contributions to the population down to
contributions as small as perhaps a few tenths of a percent.
The molecular mechanics gel graphic (Figure 4)
for the cobalt dipeptide leucine side chain suggests that
all rotational isomers except gauche+ gauche− are in the
probable set. The exclusion of this isomer is shown visually
by a single band at zero population and a complete lack
of any upward extension of this band. As mentioned earlier
in this section the molecular mechanics gel graphics of protein
side chains have lanes corresponding to most of the ideally
possible rotational isomers. Even for a side chain buried
in a protein interior a good fraction of these will at least
be in the probable set of rotational isomers. Compared to
the leucine side chain of the cobalt dipeptide the distinction
between predominant and probable sets of rotational
isomers may not be as clear cut because of the increased
energy uncertainty of rotational isomers that strongly interact
with a neighbor side chain. A large uncertainty
shows up on the molecular mechanics gel graphic as a
lane with bands at population zero and one and much
fainter extensions stretching between these two extremes.
Rotational isomers with such a large molecular
mechanics energy uncertainty should certainly be included
in the probable set and may even be sufficiently
populated to be included in the predominant set.
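One way to make the selection of the predominant and probable sets concrete is sketched below in Python. The selection rules are assumptions of this sketch rather than the procedure of the Methods: the predominant set is taken as the fewest isomers whose best-estimate populations reach 90%, and an isomer is placed in the probable set when its population exceeds 0.1% in at least 5% of Monte Carlo draws of the well depths. The nine well depths are hypothetical values and are not read from the cobalt dipeptide energy map.

```python
import math
import random

R_KCAL = 0.0019872  # gas constant, kcal/(mol K)

def boltzmann_populations(energies, temperature=300.0):
    kt = R_KCAL * temperature
    lowest = min(energies)
    weights = [math.exp(-(e - lowest) / kt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def predominant_set(names, energies, coverage=0.90):
    """Fewest isomers whose best-estimate populations add up to coverage."""
    ranked = sorted(zip(boltzmann_populations(energies), names), reverse=True)
    chosen, total = [], 0.0
    for population, name in ranked:
        chosen.append(name)
        total += population
        if total >= coverage:
            break
    return chosen

def probable_set(names, energies, sigmas, floor=0.001, min_fraction=0.05,
                 n_samples=4000, seed=0):
    """Isomers whose population exceeds floor in at least min_fraction of
    Monte Carlo draws of the well depths (Gaussian errors assumed)."""
    rng = random.Random(seed)
    hits = [0] * len(names)
    for _ in range(n_samples):
        perturbed = [e + rng.gauss(0.0, s) for e, s in zip(energies, sigmas)]
        for i, population in enumerate(boltzmann_populations(perturbed)):
            if population > floor:
                hits[i] += 1
    return [n for n, h in zip(names, hits) if h / n_samples >= min_fraction]

# Hypothetical well depths (kcal/mol); a larger sigma would model a well
# that interacts strongly with a neighbor side chain.
names = [f"well{i}" for i in range(1, 10)]
energies = [0.0, 0.1, 2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 12.0]
sigmas = [1.0] * 9

print(predominant_set(names, energies))  # ['well1', 'well2']
print(probable_set(names, energies, sigmas))
```

Raising the entry of `sigmas` for a single well shows how a large energy uncertainty can move an isomer into the probable set even when its best-estimate population is negligible.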



3. Evaluate the constraints vicinal coupling constants and 
NOESY cross relaxation rates place on rotational isomer 
populations 

Thus far we have described the molecular mechanics 
of the leucine side chain of the cobalt dipeptide, the se- 
lection of predominant and probable rotational isomers, 
and how this can be generalized to side chains of pro- 
teins. The key point about the generalization to proteins 
is that the molecular mechanics remains local. By local 
it is meant that the molecular mechanics model depends 
on the single representative global conformation, which is 
always readily available from the preexisting NMR struc- 
ture. The side chain molecular mechanics does not de- 
pend on a multiconformer model, which would in some 
way require the simultaneous solution of all the side chain 
rotational isomer populations. In this section the molec- 
ular mechanics model is fit to the NMR data to evaluate 
the populations of the predominant and the probable ro- 
tational isomers. This can be generalized to proteins by 
exploiting the locality of both the molecular mechanics 
model and NMR data to carry out the evaluation inde- 
pendently for each side chain. 

The conformational gel graphic for the probable set
of rotational isomers of the cobalt dipeptide leucine
side chain (Figure 5) shows that the experimental data,
which consists of eight vicinal coupling constants and ten
NOESY cross relaxation rates [13], places little constraint
on the populations of these eight isomers. This is indicated
by the intense bands extending from zero population
up to thirty to fifty percent population for each of the
rotational isomers in the probable set. A comparison of
the molecular mechanics and conformational gel graphics
(Figures 4 and 5) gives a striking graphical portrayal of
the dramatic variation in the usefulness of molecular mechanics
and NMR data for determining rotational isomer
populations. Apparently, the populations of most of the
cobalt dipeptide side chain rotational isomers are best determined
either by the NMR data alone or by the molecular
mechanics calculations alone. Except for the three
more extended lanes (gauche+ gauche+, trans gauche+,
and gauche− trans) near the middle of the molecular mechanics
gel graphic (Figure 4) all the rest of the rotational
isomers have bands at zero population that at most have
a relatively small upward extension. These rotational
isomer populations are better determined by molecular
mechanics. In contrast the trans gauche+ and gauche−
trans rotational isomers have bands stretching all the way
from population zero up to one in the molecular mechanics
gel graphic (Figure 4) as compared to much less extended
bands in the conformational gel graphic (Figure
5). These rotational isomer populations are better determined
by fitting the NMR data. The similarity of the
gauche+ gauche+ rotational isomer bands in both figures
suggests both molecular mechanics and NMR data must
be taken into account to determine the population of this
rotational isomer. All these conclusions are borne out by
more detailed analysis.
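The Monte Carlo construction behind a conformational gel graphic (take the fit populations as true, simulate noisy data, refit) can be sketched for a deliberately reduced problem. In the Python sketch below, three isomers are constrained by only two population-averaged vicinal coupling constants; the per-isomer coupling values, the 0.5 Hz error, and the exhaustive grid search are all assumptions made for illustration, and NOESY cross relaxation rates are omitted. Only the starting populations of 21, 54, and 25% come from the text.

```python
import random

# Hypothetical coupling constants (Hz) of two vicinal proton pairs in each
# of three rotational isomers; rows are observables, columns are isomers.
J_TABLE = [
    [3.5, 13.0, 3.5],
    [13.0, 3.5, 3.5],
]
SIGMA = 0.5  # assumed measurement error, Hz

def predict(populations):
    """Population-weighted average of each observable."""
    return [sum(p * j for p, j in zip(populations, row)) for row in J_TABLE]

def simplex_grid(n=100):
    """All population triples on a grid of spacing 1/n that sum to one."""
    return [(a / n, b / n, (n - a - b) / n)
            for a in range(n + 1) for b in range(n + 1 - a)]

GRID = simplex_grid()
PREDICTED = [predict(p) for p in GRID]

def fit(data):
    """Least-squares populations by exhaustive search over the grid."""
    def misfit(k):
        return sum((d - m) ** 2 for d, m in zip(data, PREDICTED[k]))
    return GRID[min(range(len(GRID)), key=misfit)]

def monte_carlo_fits(best_populations, n_sets=200, seed=0):
    """Treat the fit populations as true, simulate noisy data, and refit."""
    rng = random.Random(seed)
    ideal = predict(best_populations)
    return [fit([m + rng.gauss(0.0, SIGMA) for m in ideal])
            for _ in range(n_sets)]

best = (0.21, 0.54, 0.25)  # fit populations quoted in the text
fits = monte_carlo_fits(best)
for name, column in zip(("trans", "gauche-", "gauche+"), zip(*fits)):
    mean = sum(column) / len(column)
    spread = (sum((p - mean) ** 2 for p in column) / len(column)) ** 0.5
    print(f"{name}: mean {mean:.2f}, standard deviation {spread:.2f}")
```

Adding more isomers without adding observables leaves the fit underdetermined, which is the situation the probable-set gel graphic portrays as bands extending over a wide range of populations.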

Clearly it is desirable to reduce the size of the probable
set of rotational isomers to the point that the NMR data
does place significant constraints on the populations. In
the case of the cobalt dipeptide this point is reached when
the probable set is reduced to the three rotational isomers
of the predominant set defined in the previous section.
The conformational gel graphic for the predominant set
(Figure 6) displays population errors that are somewhat
to considerably smaller than the predicted populations of
the rotational isomers. This is displayed visually by the
modest to large distance from the zero population horizontal
grid line to the beginning of the high density
region of the bands. Note that the bands extend nearly
three standard deviations above and below the mean, but
the high density region extends only about two standard
deviations in each direction. Comparing the conformational
gel graphics for the probable and the predominant
sets of rotational isomers (Figures 5 and 6), a particularly
striking improvement is seen in the significance of
the population estimate of the gauche− trans rotational
isomer.
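The two-fold gray scale steps of a gel lane can be sketched in Python as follows. The sketch bins sampled populations and assigns each bin a step number counted in halvings of density below the densest bin of the lane; measuring steps relative to the lane's own peak, the bin count, and the text rendering are assumptions of this sketch. Under the further assumption of a Gaussian band, a point two standard deviations from the mean lies about three such steps below the peak.

```python
import math

def lane_steps(populations, n_bins=10, max_steps=8):
    """Bin sampled populations (0 to 1) and express each bin as a gray step:
    step 0 is the densest bin and each further step is a two-fold drop.
    Empty bins, and bins more than max_steps halvings down, are None."""
    counts = [0] * n_bins
    for p in populations:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    peak = max(counts)
    steps = []
    for count in counts:
        if count == 0:
            steps.append(None)
            continue
        step = math.floor(math.log2(peak / count))
        steps.append(step if step <= max_steps else None)
    return steps

def draw_lane(steps, ramp="#%*+=-:. "):
    """Text rendering with the highest population at the top;
    characters earlier in ramp stand for denser bins."""
    return "\n".join(" " if s is None else ramp[min(s, len(ramp) - 1)]
                     for s in reversed(steps))

# Toy lane: most samples near zero population, a faint upward extension.
demo = [0.05] * 8 + [0.15] * 4 + [0.25] * 1
print(lane_steps(demo))
# [0, 1, 3, None, None, None, None, None, None, None]
print(draw_lane(lane_steps(demo)))
```

Feeding the function one column of Monte Carlo populations per isomer would give one lane per rotational isomer.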

For any protein side chain it is in principle possible 
to define a predominant set of rotational isomers that 
is just small enough to yield significant population es- 
timates. The practical difficulty with this is that the 
molecular mechanics does not always give reliable order- 
ing of the energy well depths because of the sources of 
error discussed in the previous section. As the size of 
the probable set is reduced it will not always be clear 
which rotational isomers to include or exclude. All three 
gel graphics (Figures [4], [5], and [6]) must be considered
together to obtain a complete picture of the rotational 
isomer populations that fully accounts for the interplay 
of molecular mechanics and NMR data. A still more 
detailed analysis should also consider measurability and 
over-fitting of rotational isomer populations [15], but an
easily accessible description of the application of these 
concepts is beyond the scope of this article. 



III. CONCLUSIONS 

This work makes new theoretical predictions of interest 
to a broad range of chemists studying the structure and 
function of proteins or other complex molecules. Particu- 
larly important is the prediction that the local molecular 
mechanics of protein side chains can be extracted from 
a single representative global conformation determined 
by conventional methods. The local molecular mechan- 
ics can identify low population though potentially func- 
tional rotational isomers of buried protein side chains. 
Conformational gel analysis and graphics is an important 
new tool for display and understanding of conformational 
population estimates and of the sources and level of er- 
rors in these estimates. By helping us see more clearly 
the extent of both our knowledge and our ignorance we 
hope to fuel the demand and inspire and guide the devel- 
opment of more powerful NMR instruments and analysis 
methods. 



IV. METHODS 

Detailed descriptions as well as working computer 
input files for calculating molecular mechanics energy 
maps, fitting NMR data, and generating gel graphics, 
have been previously published 0, [l^. Briefly, cus- 
tom topology and parameter input files were created 
and χ1 × χ2 energy maps for the leucine side chain of
the cobalt dipeptide were calculated with the CHARMM
molecular mechanics program [16]. Based on the
positions of the energy well minima, nine energy minimized
rotational isomers were prepared and interatomic dis- 
tances and torsion angles for modeling cross relaxation 
rates and vicinal coupling constants were output with 
the CHARMM correlation and time series analysis com- 
mand. The optimization design matrix was obtained by 
a MATLAB function file that input a list of NOESY cross 
relaxing protons and vicinally coupled spins, read in the 
appropriate distance and angle data files output by the 
analysis command, calculated the cross relaxation rates 
and the vicinal coupling constants for each rotational iso- 
mer, and normalized each of these observables by a com- 
posite experimental and cross relaxation rate or Karplus 
coefficient error. Note that the original version of this 
MATLAB function file was somewhat more complex 
than described here because it was designed to examine 
the effects of intramolecular motions by averaging over 
the molecular mechanics energy map. The rotational iso- 
mer populations were fit by minimizing the differences 
between the experimentally measured and predicted ob- 
servables subject to the constraints that the populations 
were nonnegative and that their sum was one. This linear 
least-squares with linear constraints problem was solved 
as the equivalent quadratic programming problem [17].
The probability density functions of the fit rotational iso- 
mer populations were computed by the standard Monte 
Carlo recipe [l^: the experimental NMR observables 
were fit to yield fit rotational isomer populations and
fit NMR observables; random errors were added to the
fit NMR observables and these simulated NMR observables
were fit to give simulated fit populations; and these last
steps were repeated many times to make the Monte Carlo
probability density functions of the populations.
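
The fitting and Monte Carlo steps described above can be sketched as follows. This is a minimal illustration and not the published MATLAB code: the constrained least-squares problem is solved here by projected gradient descent onto the probability simplex rather than by a quadratic programming routine, the noise is assumed Gaussian with unit variance in the error-normalized observables, and all function names and parameters are hypothetical.

```python
import random

def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0.0:
            theta = t  # the last k passing this test gives the threshold
    return [max(x - theta, 0.0) for x in v]

def predict(A, p):
    """Population-weighted observables: one row of A per observable."""
    return [sum(a * x for a, x in zip(row, p)) for row in A]

def fit_populations(A, y, iterations=300):
    """Minimize |A p - y|^2 with p nonnegative and summing to one."""
    n = len(A[0])
    # 1 / trace(A^T A) is a safe step size for the gradient of 0.5*|A p - y|^2.
    step = 1.0 / sum(a * a for row in A for a in row)
    p = [1.0 / n] * n
    for _ in range(iterations):
        residual = [f - obs for f, obs in zip(predict(A, p), y)]
        gradient = [sum(row[j] * r for row, r in zip(A, residual))
                    for j in range(n)]
        p = project_to_simplex([x - step * g for x, g in zip(p, gradient)])
    return p

def monte_carlo_populations(A, y, sigma=1.0, trials=200, seed=0):
    """Fit the data, then refit noisy copies of the fit observables."""
    rng = random.Random(seed)
    p_fit = fit_populations(A, y)
    y_fit = predict(A, p_fit)
    samples = []
    for _ in range(trials):
        y_sim = [v + rng.gauss(0.0, sigma) for v in y_fit]  # assumed Gaussian
        samples.append(fit_populations(A, y_sim))
    return p_fit, samples
```

Histogramming each column of `samples` gives the Monte Carlo probability density function of the corresponding population. Because of the nonnegativity constraint, a finite fraction of the simulated fits can land exactly at zero population.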

The energy functions in the energy schematic were ob- 
tained by cubic interpolation from the positions of the 
energy minima and maxima. The derivatives of the
interpolating functions were constrained to zero at the
positions of these minima and maxima. Each energy
function was shifted by an energy constant so that at 300 K
the sum of the Boltzmann factors of the energy
minima equalled one. The molecular mechanics energy
function was obtained from the molecular mechanics en- 
ergy map by matching the energies of the predominant 
rotational isomer minima and the energies of the low- 
est energy barriers between the predominant rotational 
isomers. The horizontal positions of the energy function
minima and maxima were adjusted somewhat to make the
maximum slopes of the energy barriers about equal while
still approximately matching the rotation of the torsion
angle. The horizontal axis of the energy schematic is
labeled "steric rotation angle" rather than "torsion angle"
to reflect this approximation and to emphasize the steric
relationship between the side chain atoms rather than
the exact rotation angle.
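
The interpolation and normalization just described can be illustrated with a short sketch. It assumes energies in kcal/mol (the text does not state the unit here) and treats each interval between adjacent extrema as a single cubic with zero slope at both ends. The function names are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit is an assumption

def cubic_segment(x, x0, e0, x1, e1):
    """Cubic through (x0, e0) and (x1, e1) with zero slope at both ends."""
    s = (x - x0) / (x1 - x0)
    return e0 + (e1 - e0) * (3.0 * s * s - 2.0 * s ** 3)

def energy_function(x, extrema):
    """Piecewise cubic energy from a position-sorted list of (x, energy)."""
    for (x0, e0), (x1, e1) in zip(extrema, extrema[1:]):
        if x0 <= x <= x1:
            return cubic_segment(x, x0, e0, x1, e1)
    raise ValueError("x lies outside the tabulated extrema")

def boltzmann_shift(minimum_energies, temperature=300.0):
    """Constant c such that sum(exp(-(E_i + c) / kT)) over the minima is one."""
    kt = K_B * temperature
    return kt * math.log(sum(math.exp(-e / kt) for e in minimum_energies))
```

Adding the returned constant to every energy leaves the relative well depths unchanged, so each shifted Boltzmann factor can be read directly as the weight of that minimum.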

The molecular mechanics gel graphic was generated
from the energy map by a simple Monte Carlo procedure.
Random energy errors were added to the rotational
isomer well depths, and Boltzmann weighted populations
were calculated and normalized. These steps were
repeated many times, and the resulting large set of
simulated rotational isomer populations was histogrammed to
make Monte Carlo probability density functions of the
populations.
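
A minimal sketch of this procedure is given below. It assumes independent Gaussian errors on the well depths and energies in kcal/mol, neither of which is specified in the text above. The names `energy_sigma`, `trials`, and `bins` are hypothetical parameters.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit is an assumption

def boltzmann_populations(well_energies, temperature=300.0):
    """Normalized Boltzmann weights of the rotational isomer wells."""
    kt = K_B * temperature
    lowest = min(well_energies)  # subtract the minimum to avoid overflow
    weights = [math.exp(-(e - lowest) / kt) for e in well_energies]
    total = sum(weights)
    return [w / total for w in weights]

def simulate_population_histograms(well_energies, energy_sigma,
                                   temperature=300.0, trials=10000,
                                   bins=50, seed=0):
    """Histogram the populations obtained from randomly perturbed well depths."""
    rng = random.Random(seed)
    histograms = [[0] * bins for _ in well_energies]
    for _ in range(trials):
        noisy = [e + rng.gauss(0.0, energy_sigma) for e in well_energies]
        for i, p in enumerate(boltzmann_populations(noisy, temperature)):
            index = min(int(p * bins), bins - 1)  # p == 1 goes in the last bin
            histograms[i][index] += 1
    return [[count / trials for count in h] for h in histograms]
```

Each returned list is the probability per population bin for one rotational isomer, which is the quantity a lane of the gel graphic displays.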



Monte Carlo probability density functions were dis- 
played as gel graphics, which were designed to visually in- 
dicate both the discrete probability fraction at zero pop- 
ulation and shape of the continuous probability density 
over the range of population from zero to one. This was
accomplished by a simulated photographic process where 
the degree of film overexposure indicates the probability 
fraction at zero population and continuous gray tones 
represent the continuous part of the probability density. 
To simulate film overexposure at zero population and 
smooth the lane edges along the continuous part of the 
probability distribution a Gaussian blur filter was applied 
to the image so that the typical probability density at 
zero population was still considerably greater than that 
along the continuous part of the probability distribution. 
The pixel values of the blurred image were treated like 
scene luminances and converted into photographic 
print densities with a characteristic curve [20] that had a
maximum point-gamma of 1.5. The print densities were 
linearly mapped into gray scale values so that maximum 
printable density was somewhere along the continuous 
part of the probability distribution. A stepwedge bar of 
the 11 zones in the Zone System [21] was added to the gel
graphic as an aid to calibrating the probability densities. 
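
The simulated photographic process can be sketched for a single lane as follows. The characteristic curve used for the published graphics is not reproduced here. As a stand-in, this sketch assumes a logistic curve in log exposure whose steepest slope equals the stated maximum point-gamma of 1.5. The maximum density `d_max`, the blur width, and all function names are hypothetical.

```python
import math

def gaussian_blur(values, sigma):
    """One-dimensional Gaussian blur along the population axis of a lane."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (i / sigma) ** 2) for i in offsets]
    norm = sum(kernel)
    kernel = [k / norm for k in kernel]
    n = len(values)
    blurred = []
    for i in range(n):
        total = 0.0
        for offset, weight in zip(offsets, kernel):
            j = min(max(i + offset, 0), n - 1)  # clamp at the lane ends
            total += weight * values[j]
        blurred.append(total)
    return blurred

def print_density(luminance, max_gamma=1.5, d_max=2.0, mid_log_exposure=0.0):
    """Assumed logistic characteristic curve: density versus log10 exposure."""
    log_e = math.log10(max(luminance, 1e-12))
    k = 4.0 * max_gamma / d_max  # steepest slope of the logistic is d_max*k/4
    return d_max / (1.0 + math.exp(-k * (log_e - mid_log_exposure)))

def to_gray(densities, printable_density):
    """Linear map to 8-bit gray; densities above the printable limit clip to 0."""
    return [round(255 * (1.0 - min(d / printable_density, 1.0)))
            for d in densities]
```

Choosing `printable_density` below the density reached at zero population makes that pixel clip to black, which mimics the overexposure used to indicate the discrete probability fraction.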

Vector PostScript molecular graphics were generated 
with the RasMol program [22].